Matrix Transposition on a Mesh with Blocking Transmissions
Abstract
A time-optimal procedure is shown for transposing, in situ, a matrix stored over a distributed-memory, two-dimensional mesh-connected parallel computer. The matrix need not be square. Only nearest-neighbor blocking communication is used, and only a small, bounded amount of buffer space is required.
Similar resources
Efficient Matrix Multiplication Using Cache Conscious Data Layouts
This paper demonstrates performance improvements for matrix multiplication and mesh generation for Finite Element Method (FEM) by optimizing the memory hierarchy of traditional processors. The theory developed earlier is used to perform such optimizations. Our work provides a uniform methodology across multiple HPC platforms for optimizing the performance of the kernel codes (such as matrix tran...
Cascade Rectangle Algorithm and Transfer Matrix in Shortest-Path Networks with Cycles
The shortest path problem is among the most interesting problems in graph and network theory. The literature contains many efficient matrix-based algorithms for finding the shortest paths and distances between all pairs of nodes. In this paper, a new exact algorithm, named Cascade Rectangle Algorithm, is presented by using main structure of previous exact algorithms and developin...
A Novel Multicast Tree Construction Algorithm for Multi-Radio Multi-Channel Wireless Mesh Networks
Many appealing multicast services, such as on-demand TV, teleconferencing, and online games, can benefit from the high available bandwidth in multi-radio multi-channel wireless mesh networks. When multiple simultaneous transmissions use a similar channel to transmit data packets, network performance degrades to a large extent. Designing a good multicast tree to route data packets could enhance the...
On the Impact of Beamforming on Interference in Wireless Mesh Networks
For a wireless mesh network with randomly placed devices, we analyze the impact of beamforming on the statistics of the signal attenuation, interference, and SIR between devices. We show that simple random direction beamforming performs as well as omni-directional transmission if no MAC layer is present. Optimized beamforming between transmitter-receiver pairs yields a much better SIR than...
An Efficient Transposition Algorithm
Data transposition is required in many numerical applications. When implemented on a distributed-memory computer, data transposition requires all-to-all communication, a time-consuming operation. The Direct Exchange algorithm, commonly used for this task, is inefficient if the number of processors is large. We investigate a series of more sophisticated techniques: the Ring Exchange, Mesh Exchang...
Journal title:
- Parallel Processing Letters
Volume 8, Issue
Pages -
Publication date 1998